From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood
Our goal is to learn a semantic parser that maps natural language utterances
into executable programs when only indirect supervision is available: examples
are labeled with the correct execution result, but not the program itself.
Consequently, we must search the space of programs for those that output the
correct result, while not being misled by spurious programs: incorrect programs
that coincidentally output the correct result. We connect two common learning
paradigms, reinforcement learning (RL) and maximum marginal likelihood (MML),
and then present a new learning algorithm that combines the strengths of both.
The new algorithm guards against spurious programs by combining the systematic
search traditionally employed in MML with the randomized exploration of RL, and
by updating parameters such that probability is spread more evenly across
consistent programs. We apply our learning algorithm to a new neural semantic
parser and show significant gains over existing state-of-the-art results on a
recent context-dependent semantic parsing task.
Comment: Proceedings of the 55th Annual Meeting of the Association for
Computational Linguistics (2017)
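The difference between the two weightings the abstract connects can be sketched in a few lines. This is a minimal illustration of the gradient weights only, not the paper's full algorithm (which also adds randomized exploration and its own smoothing); the function name and toy probabilities are ours:

```python
import math

def update_weights(logprobs, consistent):
    """Compare gradient weights under two learning paradigms for
    weak supervision, given candidate programs' log-probabilities
    and a mask of which ones execute to the correct result.

    RL (policy gradient, binary reward): each program is weighted by
    its own probability times its reward, so updates concentrate on
    whichever consistent program the model already favors -- risky
    if that program is spurious.

    MML: the same probabilities renormalized over the consistent set,
    so the update spreads probability mass more evenly across all
    programs that produce the correct result.
    """
    probs = [math.exp(lp) for lp in logprobs]
    rl = [p if ok else 0.0 for p, ok in zip(probs, consistent)]
    z = sum(rl)
    mml = [w / z for w in rl] if z > 0 else [0.0] * len(rl)
    return rl, mml

# Three candidate programs; the first and third execute correctly.
rl, mml = update_weights(
    logprobs=[math.log(0.5), math.log(0.3), math.log(0.2)],
    consistent=[True, False, True],
)
```

Here the RL weights are the raw probabilities of the consistent programs (0.5 and 0.2), while the MML weights renormalize them to sum to one, which is the "spread probability more evenly" behavior the abstract refers to.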
PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions
The remarkable capabilities of large language models have been accompanied by
a persistent drawback: the generation of false and unsubstantiated claims
commonly known as "hallucinations". To combat this issue, recent research has
introduced approaches that involve editing and attributing the outputs of
language models, particularly through prompt-based editing. However, the
inference cost and speed of using large language models for editing currently
bottleneck prompt-based methods. These bottlenecks motivate the training of
compact editors, which is challenging due to the scarcity of training data for
this purpose. To overcome these challenges, we exploit the power of large
language models to introduce corruptions (i.e., noise) into text and
subsequently fine-tune compact editors to denoise the corruptions by
incorporating relevant evidence. Our methodology is entirely unsupervised and
provides us with faux hallucinations for training in any domain. Our Petite
Unsupervised Research and Revision model, PURR, not only improves attribution
over existing editing methods based on fine-tuning and prompting, but also
achieves faster execution times by orders of magnitude.
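The corrupt-then-denoise data pipeline can be sketched as follows. In PURR the corruptions come from prompting a large language model; the word-dropout stand-in below, the function names, and the pair schema are all our own illustrative assumptions, showing only the data flow (corrupted claim + evidence -> clean target):

```python
import random

def corrupt(text, rng, drop_prob=0.3):
    """Toy stand-in for LLM-introduced corruption: randomly drop words
    to produce a 'faux hallucination'. The real method prompts a large
    language model to inject noise; this only mimics the shape of the
    output."""
    words = text.split()
    kept = [w for w in words if rng.random() > drop_prob]
    return " ".join(kept) if kept else words[0]

def build_training_pairs(passages, rng):
    """Build unsupervised editor-training examples: the compact editor
    learns to map (corrupted claim, relevant evidence) back to the
    clean claim. Here each clean passage serves as its own evidence."""
    pairs = []
    for passage in passages:
        pairs.append({
            "input": corrupt(passage, rng),   # faux hallucination
            "evidence": passage,              # retrieved support
            "target": passage,                # denoising target
        })
    return pairs

rng = random.Random(0)
pairs = build_training_pairs(
    ["The Eiffel Tower was completed in 1889 in Paris."], rng
)
```

Because the clean text is generated for free from any corpus, this loop yields training data "in any domain" without human labels, which is what makes training a compact editor feasible.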
Simfluence: Modeling the Influence of Individual Training Examples by Simulating Training Runs
Training data attribution (TDA) methods offer to trace a model's prediction
on any given example back to specific influential training examples. Existing
approaches do so by assigning a scalar influence score to each training
example, under a simplifying assumption that influence is additive. But in
reality, we observe that training examples interact in highly non-additive ways
due to factors such as inter-example redundancy, training order, and curriculum
learning effects.
To study such interactions, we propose Simfluence, a new paradigm for TDA
where the goal is not to produce a single influence score per example, but
instead a training run simulator: the user asks, ``If my model had trained on
example $z_1$, then $z_2$, ..., then $z_n$, how would it behave on
$z_{test}$?''; the simulator should then output a simulated training run, which
is a time series predicting the loss on $z_{test}$ at every step of the
simulated run. This enables users to answer counterfactual questions about what
their model would have learned under different training curricula, and to
directly see where in training that learning would occur.
We present a simulator, Simfluence-Linear, that captures non-additive
interactions and is often able to predict the spiky trajectory of individual
example losses with surprising fidelity. Furthermore, we show that existing TDA
methods such as TracIn and influence functions can be viewed as special cases
of Simfluence-Linear. This enables us to directly compare methods in terms of
their simulation accuracy, subsuming several prior TDA approaches to
evaluation. In experiments on large language model (LLM) fine-tuning, we show
that our method predicts loss trajectories with much higher accuracy than
existing TDA methods (doubling Spearman's correlation and reducing mean-squared
error by 75%) across several tasks, models, and training methods.
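A simulator in this spirit can be sketched in a few lines. This is our own minimal reading of a linear per-step update, not the paper's exact parameterization: each training example i is assumed to contribute a learned multiplicative factor A[i] and additive factor B[i] to the test loss whenever it appears in a step's batch:

```python
def simulate_run(loss0, curriculum, A, B):
    """Simulate the test-example loss trajectory under a given training
    curriculum (a list of batches of example indices). Assumed update:
        L_t = (product of A[i] for i in batch_t) * L_{t-1}
              + (sum of B[i] for i in batch_t)
    Non-additivity arises because an example's effect is scaled by the
    current loss, which depends on everything trained so far."""
    losses = [loss0]
    for batch in curriculum:
        mult = 1.0
        add = 0.0
        for i in batch:
            mult *= A[i]
            add += B[i]
        losses.append(mult * losses[-1] + add)
    return losses

A = {0: 0.9, 1: 0.8}   # hypothetical learned multiplicative factors
B = {0: 0.1, 1: 0.0}   # hypothetical learned additive factors
run = simulate_run(1.0, [[0], [1]], A, B)
run_reordered = simulate_run(1.0, [[1], [0]], A, B)
```

Comparing `run` and `run_reordered` shows the counterfactual use case from the abstract: reordering the same two examples yields a different final loss, something a single additive influence score per example cannot express.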